Overview

Why performance assessment? 

The main curricular emphasis in the second International Assessment of
Educational Progress (IAEP) was on mathematics and science. The extensive
use of pencil-and-paper tests in the main IAEP assessments made it possible to
achieve good coverage of the knowledge and skills which could be assessed
using such test instruments.¹ However, an analysis of the mathematics and
science curricula in most of the countries involved in IAEP showed that they
included at least some skills and processes which could not be assessed
adequately with pencil-and-paper tests alone.

The potential value of performance assessment as a supplement to
pencil-and-paper tests was recognized by the IAEP assessment developers.
Experience in the United Kingdom with national educational monitoring by the 
Assessment of Performance Unit (APU) in England and Wales and the 
Assessment of Achievement Programme (AAP) in Scotland had demonstrated 
that some types of performance assessment were feasible, in practical and cost 
terms, in national surveys of student attainment. 



¹ Archie E. Lapointe, Nancy A. Mead, and Janice M. Askew. Learning Mathematics.
Princeton, NJ: Educational Testing Service, 1992.

Archie E. Lapointe, Janice M. Askew, and Nancy A. Mead. Learning Science. Princeton,
NJ: Educational Testing Service, 1992.
Given the UK experience and the desirability of extending the
curriculum coverage in IAEP, it was decided to include a limited, optional
component of performance assessment in the 1991 survey. The assessment was
developed for 13-year-old students only and included mathematics and science
tasks to enable IAEP participants to experiment with performance assessment
in an international context.



What type of performance assessment? 

The experimental nature of the performance assessment in IAEP
required that the approach used be based on existing best practice. 
Consequently, the approach and the test materials used drew heavily on the UK 
experience but added essential amendments to meet the needs of an 
international study. For instance, they had to be robust enough to be valid in a 
variety of different curricular contexts and to be capable of operation with 
limited equipment and materials by staff who had no prior experience in this 
type of assessment. 



The approach decided on was a series (or circuit) of stations, each 
involving a short task (or tasks), which students would carry out under the 
supervision of a trained assessor. The tasks required students to demonstrate 
practical skills, such as measurement or observation, and provided a more 
realistic context than a written test for assessing cognitive skills, such as 
inferring or hypothesizing. The activity at each station was designed to be 
completed by students in about five to eight minutes. Kits of standard pieces 
of equipment and materials, including master copies of diagrams, were supplied 
to all of the countries participating in the performance assessment. 
Standardized instruction manuals and scoring guides were supplied for 
administering and scoring the tasks. 

Procedures and tasks for the performance assessment were pilot-tested 
in May 1990. The final design of the performance assessment had two circuits, 
each containing eight stations. One circuit consisted of mathematics tasks and 
the other of science tasks. The two circuits could be administered by one 
assessor in parallel or in series, each accommodating separate samples of six
students. Students were allowed to spend as much time at each station as they
needed. Assessors only asked individual students to move on if they appeared
to have completed all that they could. This flexibility and the availability of
more stations in a circuit than students working on it were intended to avoid
queuing for stations while providing time for students to deliver their best
performance.

The outcomes included artifacts, drawings, and written responses, a total
of about 36 outcomes from the 16 stations. Only two outcomes, both in the
science circuit, were marked on the spot by the assessor. The remainder were
scored later by trained scorers.

What was assessed? 

The performance tasks required students to apply concepts, observe, 
measure, manipulate equipment and materials, and record and interpret data. 
The tasks were designed to correspond with aspects of the main assessment
framework and the curricula of the participating countries. However,
classification by content and process was complicated because many of the 
tasks required a series of steps and could be solved in a number of ways. The 
outcomes of the performance assessment tasks reflected only a portion of the 
overall assessment framework, as do all tests of broad curricular areas. 
However, it must be borne in mind that only a subset of the elements in the 
overall framework included practical skills or skills that benefit from a practical 
context. 

The mathematics tasks tended to concentrate on aspects of Measurement 
and Geometry, which together constituted 35 percent of the agreed-upon IAEP
assessment framework. This was for two main reasons: elements of these two 
aspects of mathematics could be assessed only by using performance tasks; and 
focussing on these elements made it possible to obtain measures of 
performance on sets of related skills. The tasks also assessed conceptual 
understanding, procedural knowledge and problem solving, which represented
the other dimension of the assessment framework.²

In science, the tasks were drawn mainly from aspects of the Physical 
Sciences and Nature of Science, which together constituted 50 percent of the 
overall assessment framework. This was partly because of restrictions on 
sending biological material, such as potting soils or seeds, between countries 
and partly because few good tasks in the Life or Earth and Space Sciences 
were available. The tasks did, however, assess the other dimensions of the 
assessment framework: knows facts, concepts and principles; uses knowledge to 
solve simple problems; and integrates knowledge to solve more complex 
problems. 

It is worth noting here that performance assessment frameworks may
vary more from country to country because of differences in the countries'
curricula than do assessment frameworks for traditional pencil-and-paper tests.
For instance, the main categories for science at age 13 in the UK national
monitoring programmes were:

APU - England and Wales                        AAP - Scotland

Using symbolic representations                 Observing*
Using apparatus and measuring instruments*     Measuring*
Using observation*                             Handling information*
Interpretation and application                 Using knowledge*
Design of investigations                       Using simple procedures
Performance of investigations*                 Inferring*
                                               Investigating*

* Aspects in which some form of performance assessment is conducted.

Such differences have implications for the emphasis given to
performance assessment, as opposed to pencil-and-paper tests, and for the type
of tasks used.



² Center for the Assessment of Educational Progress. The 1991 IAEP Assessment:
Objectives for Mathematics, Science, and Geography. Princeton, NJ: Educational Testing
Service, 1991.
Who was tested?



Nine of the IAEP participants, four countries and five Canadian
provinces (two including separate English- and French-speaking populations),
administered the performance assessment tasks. After careful consideration, the
United States decided not to participate in this international experiment but
rather to evaluate the results of the assessment before taking part in future
comparative studies of this type.

A subsample of about one quarter of the schools involved in the IAEP
pencil-and-paper testing was randomly selected for the performance
assessment. Likewise, the performance tasks were administered to a subsample
of the students in those schools who had taken the IAEP pencil-and-paper tests
to ensure some common base of experience and permit comparison of written
test and performance task results. About two-thirds of the original students in
the selected schools took part in the performance assessment. The participants
and their achieved samples were:



Participants                  Number Assessed      Number Assessed
                              in Mathematics       in Science

Canada
  Alberta
  British Columbia
  Nova Scotia
  Ontario-English
  Ontario-French
  Saskatchewan-English
  Saskatchewan-French
England*
Scotland
Soviet Union
Taiwan

[The numbers assessed in each population are not legible in the source copy.]
* In England, the participation rate for the original sample was very low
(47 percent), so the results for England presented in this report may reflect nonresponse bias.



In addition to their limited size, it should be noted that some countries'
school samples were restricted geographically in order to contain costs. For 
instance, the schools in Scotland were selected from those in the central belt of 
the country only. 

What information can be reported? 

The performance assessment experiment provided two types of 
information. First, although the student samples were limited and, therefore, 
could not produce accurate measures of performance of total country (or 
province) populations, the process did provide a rich source of information on 
how a sample of students perform on practical tasks. Second, the experiment 
also provided information about the strengths and weaknesses of the particular 
approach used. 

How well did students perform? 

There are three general points worth emphasizing before the findings
are considered in more detail.

• Scores varied widely from task to task, suggesting that
the measures tap a range of skills and knowledge.

• Scores on the various tasks varied significantly from 
country to country (and from province to province) in 
systematic ways, indicating real differences in 
performance between the various populations. 

• The relative performances of countries and provinces 
were generally different from those identified by the 
written tests covering related curricular areas. This 
suggests that using "hands-on" methods of assessment 
allowed students to demonstrate their skills in ways that 
were not possible with traditional paper-and-pencil tests. 



The following describes some of the main findings from the
performance assessment. More detailed results are provided in the second and 
third parts of this report. 

• In measurement skills, the main question was not one
of accuracy of measurement but whether decisions on
what to measure (or how to calculate answers from
measurements) were correct. Across all participating
countries and provinces, about 40 percent of students
measuring the perimeter of a rectangle provided correct
answers within plus or minus 4mm. Those who were
incorrect tended to be a long way out, probably because of
over- or under-counting the rectangle's sides or
misreading cm and mm on the ruler. The task of
estimating irregular areas using a grid square produced a
similar outcome, with reasonably accurate answers
produced by most students but wildly inaccurate ones by
the rest. Those giving wrong answers when measuring
angles with a 180° protractor generally gave the size
of the "reverse" angle (i.e., in the case of the acute and
obtuse angle, 180° minus the size of the angle, and in the
case of the reflex angle, 360° minus the size of the
angle).

• In problem-solving tasks based on geometry, most
students performed well. Difficulties in producing and
handling shapes were few, although more difficulties
were encountered with more complex shapes. However,
in a task requiring the use of a pinboard (or geoboard) to
identify different-size squares, the square with a
"diamond" orientation was overlooked by many
students.

• One problem-solving task based on weighing showed that 
students had difficulty weighing accurately. However, of 
greater interest was whether students used a precise 
method for solving the problem or an estimation 
approach. In most cases, the latter predominated. 
For details of the range of strategies adopted for this task, 
see the Appendix. 



• Three of the science tasks required objects to be
categorized in terms of electrical conductivity,
magnetism, and appearance, respectively. The first two
of these required students to carry out tests on the
objects. The categorization was accomplished
successfully by most students and most also provided a
satisfactory explanation for their categorization. It
seems, therefore, that most 13-year-olds can carry out
simple tests systematically and that they have at least
a basic understanding of conductivity and magnetism.
In the third task, most students could categorize correctly,
but only half or fewer could provide a satisfactory
explanation.

• Two of the science tasks required students to follow
instructions provided mainly in diagrammatic form. In
both tasks high success rates were achieved and the
required artifacts, an electrical circuit and filtering
apparatus, were assembled correctly by most pupils. It
seems that by age 13, most students can utilize
scientific apparatus, even when they are not familiar
with it, when clear instructions are provided. In this
case the less able students were probably helped by the
very limited reading requirements of the tasks.

• One of the science tasks was a simple investigation in
which the starch and/or glucose content of three solutions
was determined by using chemical indicators. Across the
participating countries and provinces, about two-thirds of
the students identified the glucose solution, fewer than 60
percent identified the starch solution and less than half
recognized that the third solution contained both. False
positives proved to be a problem, perhaps because
some students regarded any change in the indicator,
even intensity of colour, as the required change. This
probably reflects unfamiliarity with the use of
chemical indicators. Problems in following procedures
were minimal.

• A task designed to assess whether students could
differentiate between statements of fact and deductions 
involved two small plastic objects, each in its own jar, 
one object floating and one submerged in a clear liquid. 
Most pupils selected the two factual statements from 
the five statements provided but more than half also 
selected a deduction, that the submerged object was 
heavier. In fact, the deduction was incorrect, as the 
objects were identical but the density of the liquids 
differed. This result mirrors a tendency 13-year-olds 
displayed in similar written test questions, where they 
made plausible deductions when asked to make only 
observations. 

• A task designed to test visual, auditory and olfactory 
observations involved dissolving a fruit-flavoured tablet 
in water and recording what happened. Most students 
noted changes in the tablet's size and in the colour of the 
water and that gas (i.e., bubbles) was emitted. Only a 
minority noted that the tablet moved, that the 
transparency of the water decreased, that there was a 
fizzing sound, and that there was a smell of fruit. The 
results may reflect students' interpretation of the 
word "observe" in the question as being a visual cue 
and perhaps also a tendency to regard their visual
senses as predominant in a scientific context. 



What lessons did we learn? 

The main lesson was that this form of performance assessment can 
be used reliably in international comparative studies, although at an 
estimated cost three to four times greater than for an equivalent number 
of written test questions. With careful task development and pilot-testing, 
tight quality control of equipment and materials, and clear instructions,
performance tasks can be designed and administered to provide standardized
results in curricular areas which cannot be assessed adequately with written
tests. The emphasis educators place on these curricular areas will dictate
whether this type of performance assessment justifies the extra resources
required.

Equipment and materials. The equipment and materials used in the
performance assessment were constrained in a number of ways. Since each
participant needed to purchase several kits which included the necessary
equipment and materials, it was necessary to use mostly low-cost items. The
need to send kits to various countries required them to be reasonably compact,
light in weight, and robust. Kits also had to comply with customs and duty
regulations. In addition to these practical issues, the equipment had to be
either familiar to students or simple enough to be used with minimal
instructions. Also, since schools were asked to provide only flat work surfaces
and a supply of water, equipment needed to meet these constraints as well.

How well did we do with respect to these factors? By using
combinations of short tasks for our assessment, we proved it is possible to 
produce portable kits of reasonably robust equipment. The kits were assembled 
centrally and distributed in appropriate numbers to each participating country, 
to ensure that the equipment used in the assessment was standardized and 
identical in all countries and provinces. 

Even with careful preparation, three major problems arose with 
equipment and materials used in the performance assessment. As a result, a 
few results had to be discarded and some others had to be re-analyzed.

The first of these problems arose during the reproduction of students'
booklets in the various countries. This was done by photocopying processes, 
which tend to produce small changes in scale, a feature warned about in the 
performance assessment guidelines. The resulting nonstandard sizes of 
diagrams (in two of the stations) in some countries and provinces required 
reanalysis of their results, using the actual rather than the intended sizes. This 
problem could be resolved in future surveys by supplying all copies of 
diagrams from a central source. 

The second problem was similar in nature. Plastic containers with 
specific volumes marked on them were used in one station and some of these 
proved to be nonstandard, a few by a substantial amount. It was possible to 
take into account the equipment variation in the analysis of the first task in this 
station, but results from the second had to be discarded. In future surveys, all
such apparatus should be checked, as significant production variations can 
occur even in something as simple as rulers. 

The third problem concerned the task which required the categorization 
of different types of seeds. As biological material could not be included in the 
kits, each country and province was required to obtain the specified seeds. In 
some cases this proved impossible and, despite some attempts at improvisation, 
it became clear that the results from different testing locations for one group of 
seeds were not comparable. The results were subsequently discarded. 

Two minor difficulties arose related to students' understanding of the 
instructions that accompanied the equipment and material. First, there was a 
translation problem with the students' instructions in the Soviet Union for the
task called Leaves. This was corrected during the assessment, but the results
are not comparable with those obtained from other countries. Second, one set 
of instructions used the word "enable," which was misinterpreted by some 
students in one population as "unable" (despite the fact that the sentence 
involved did not then make sense). In this situation, students who provided the 
opposite response along with an appropriate explanation were counted as giving 
correct answers. 

Administration and organization. The performance assessment tasks 
were designed to be administered by assessors running the two circuits either in 
parallel or in sequence and, in practice, they were administered in both ways. 

The assessors in some countries and provinces were members of the 
teaching staff in the sample schools and in other locations, they were not, 
although most were trained teachers. Each assessor received training about the 
performance assessment tasks, their administration and their scoring. 

The administration of the tasks raised few problems beyond those 
already mentioned. In some smaller schools space was limited and the circuits 
had to be set up wherever space was available. The lack of a water supply in
some schools had been anticipated, so testing needs could be met with a large
container. 

How did teachers and students react? 

The message here was clear and unequivocal. Teachers involved in the 
performance assessment and others who only observed it in progress expressed 
enthusiasm for the approach and interest in the equipment and materials used. 
Many said they were impressed by the tasks and some suggested that such 
activities could and should be integrated into classroom work. In fact, this is 
being done in at least one Canadian province as well as in the UK. In Taiwan 
the performance assessment tasks have since been used with students other than 
those involved in IAEP.

Teachers who observed students they knew doing the tasks were 
sometimes surprised at how the students tackled them and what they were able 
to achieve. Generally, students were enthusiastic and willing to try all of 
the tasks, regardless of their individual abilities, and in some cases, 
students regarded as low attainers achieved more than was expected. The 
fact that all students could, at least, attempt the tasks and achieve something 
seemed to eliminate the feeling of failure that low-attaining students sometimes 
experience in taking written tests. Secondary analysis of the written test and 
performance assessment data sets should enable us to clarify whether there are 
differences in the relative performance of low-attaining students. 

The reaction of students was almost universally favourable. Many 
assessors reported that students said they had enjoyed doing the tasks and 
remarked on how well motivated they appeared to be in undertaking them. 
The lack of any significant problems in completing tasks, when they were 
essentially untimed, may well be attributable to the positive attitudes of the 
students and to their self-discipline in this type of activity, as well as to the 
work of the assessors. 
What can students do in performance assessment?

The following section of the report describes what students achieved in 
the performance assessment tasks. In order to ensure that the results can be 
understood readily, full details of each task are provided. It should be noted, 
however, that the text of the tasks has been abbreviated and the diagrams have 
been altered in size and presentation. In keeping with the experimental nature 
of the performance assessment, information is also provided on the few tasks 
which did not work as expected. 



The tasks are described in the following order:

Content Area    Name of Task       Skills Assessed

Mathematics     Perimeter          Measurement skills
                Ticket             Measurement skills and problem solving
                Angles             Measurement skills
                Leaves             Measurement skills
                Pinboard           Geometry concepts and problem solving
                Triangles          Geometry concepts
                Scissors           Geometry problem solving
                Clay               Measurement skills and problem solving
                Water              Measurement skills

Science         Light-up           Physical science concepts and skills
                Circuit            Physical science concepts and skills
                Filter             Nature of science skills
                Magnet             Physical science concepts and skills
                Indicators         Physical science skills
                Float              Nature of science skills
                Tablet in Water    Nature of science skills
                Seeds              Life science problem solving



The results presented for each task that follow are weighted percentages 
of correct responses or responses of a particular type. Next to each printed 
statistic in parentheses is an estimate of sampling error (standard error). It is 
especially important to consider the imprecision in the estimates when 
comparing two populations with similar results. 
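
As a rough illustration of why the standard errors matter when two populations are compared, the sketch below contrasts two reported percentages. It assumes that the two samples are independent and that a normal approximation is adequate; neither assumption is stated in this report. The function name and the figures in the usage lines are illustrative only.

```python
import math


def compare_percentages(p1, se1, p2, se2):
    """Compare two percentages that are reported with standard errors.

    Returns the difference, the standard error of the difference, and
    their ratio (an approximate z statistic). Treating the two samples
    as independent is an assumption of this sketch.
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    if se_diff == 0:
        raise ValueError("both standard errors are zero; the ratio is undefined")
    diff = p1 - p2
    return diff, se_diff, diff / se_diff


# Illustrative figures in the reporting style used here: 45 (2.4) versus 39 (2.9).
diff, se_diff, z = compare_percentages(45, 2.4, 39, 2.9)
print(f"difference = {diff}, standard error = {se_diff:.2f}, ratio = {z:.2f}")
```

With these figures the 6-point difference is about 1.6 times its standard error, below the conventional 1.96 threshold for a 95 percent level, so the two results could not be clearly distinguished.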






Mathematics Tasks 
PERIMETER

Task Descriptor

To measure the perimeter of a rectangle.

Equipment/Material

A centimetre/mm ruler and a rectangle (119 mm by 72 mm) printed in the student's booklet.



Student Instructions

Measure the distance around the rectangle to the nearest mm.

Scoring Scheme

Credit for correct measurement ± 4mm.

Problems

In some countries photocopying expanded the size of the rectangle to be measured. To allow 
for this, all answers within ± 4mm of the actual size were given credit. 
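
The scoring rule can be sketched as follows, as an illustration only. The rectangle dimensions and the ± 4mm tolerance come from the task description; the function names and the `scale` parameter (a stand-in for photocopy enlargement) are hypothetical, and the 2 percent enlargement in the usage lines is an invented example, not a measured value.

```python
TOLERANCE_MM = 4  # credit is given within plus or minus 4 mm


def perimeter_mm(width_mm, height_mm, scale=1.0):
    """Perimeter of the printed rectangle.

    `scale` is a hypothetical photocopy enlargement factor (1.0 = no change).
    """
    return 2 * (width_mm + height_mm) * scale


def earns_credit(answer_mm, actual_mm, tolerance_mm=TOLERANCE_MM):
    """Credit an answer lying within the tolerance of the actual perimeter."""
    return abs(answer_mm - actual_mm) <= tolerance_mm


intended = perimeter_mm(119, 72)             # 382 mm for the master copy
enlarged = perimeter_mm(119, 72, scale=1.02)  # invented 2 percent enlargement

print(earns_credit(380, intended))  # within 4 mm of 382
print(earns_credit(382, enlarged))  # judged against the enlarged copy instead
```

Scoring against the actual rather than the intended size means a student who measured an enlarged rectangle correctly still receives credit.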

Comments 

• There was a wide range of performance among the Canadian provinces, with scores
ranging from 24 to 52 percent, whereas the scores in the other countries were more similar.

• In Ontario and Saskatchewan, differences in the performance of the English- and
French-speaking populations favoured the latter.

• A sizable number of students (17 percent across all participating countries and provinces)
gave the measurement in cm although mm was specified in the instructions.
Percentage of Correct Responses (with Standard Errors)

                          Perimeter ± 4mm

Alberta                   44 (3.8)
British Columbia          37 (4.4)
Nova Scotia               30 (4.0)
Ontario-English           24 (2.4)
Ontario-French            31 (2.6)
Saskatchewan-English      38 (3.2)
Saskatchewan-French       52 (0.0)
England                   41 (5.0)
Scotland                  45 (2.4)
Soviet Union              39 (2.9)
Taiwan                    41 (3.2)
TICKET



Task Descriptor 

To determine what is the greatest number of tickets (rectangles) that can be cut from a sheet of 
paper. 

Equipment/Material 

One ticket cm by 7 cm) and a blank sheet of paper (24 cm by 21 cm). (Lines have been 
added to show the solution.) 



Student Instructions 

Paul got 12 tickets by cutting up his sheet of paper. Julie managed to get 13 tickets from her 
sheet. 

Find the greatest number of tickets that can be made from a sheet of paper. Draw lines on the 
sheet of paper to show how it would be divided up to make this number of tickets. 

Scoring Scheme 

Credit given for answer 14 and for drawing lines as shown above.

Comments

• Scores on this task were quite low - the highest being just over 30 percent indicating that 
14 tickets could be made. Fewer were able to draw the lines showing the correct solution. 

• Across all participating countries and provinces, 9 percent of students thought the 
maximum number of tickets that could be made was 13 and 47 percent thought only 12 
could be made. (Both of these numbers were mentioned in the task instructions.) 
Percentage of Correct Responses (with Standard Errors)

                          14 tickets     Correct lines drawn

Alberta                   27 (3.6)       25 (3.7)
British Columbia          31 (1.2)       26 (1.9)
Nova Scotia               25 (2.6)       16 (2.4)
Ontario-English           22 (2.3)       17 (2.0)
Ontario-French            23 (2.5)       18 (2.2)
Saskatchewan-English      24 (3.1)       20 (3.6)
Saskatchewan-French       28 (0.0)       27 (0.0)
England                   27 (2.5)       21 (2.0)
Scotland                  29 (2.5)       29 (2.2)
Soviet Union              24 (2.3)       25 (2.5)
Taiwan                    19 (3.5)       16 (2.5)
ANGLES

Task Descriptor

To measure three angles using a 180° protractor.

Equipment/Material

A 180° protractor and three angles provided in the student's booklet (120°, 58° and 280°
respectively).

Student Instructions

Measure the angles to the nearest degree.

Scoring Scheme

Credit for correct measurement ± 2°.

Comments

• Results were similar on the obtuse and acute angles, ranging from 47 to 72 percent correct.
Performance was much lower on the reflex angle, ranging from 22 to 40 percent correct.

• Obtuse Angle A: The most common wrong answer, given by 21 percent of students across
participating countries and provinces, was 60° (± 2°), presumably because they measured
the "reverse" acute angle.

• Acute Angle B: The most common wrong answer, given by 11 percent of all students, was
122° (± 2°), presumably because they measured the "reverse" obtuse angle.

• Reflex Angle C: The most common wrong answer, given by 28 percent of all students, was
80° (± 2°), presumably because they measured the "reverse" acute angle. Twelve percent
of students answered 100° (± 2°), perhaps because they measured the acute angle (80°) and
subtracted from 180° instead of 360°.
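
The arithmetic behind these common wrong answers can be sketched as follows. The function name is hypothetical, and the rule is only a reading of the comments above, not a scoring procedure taken from the assessment manuals.

```python
def likely_wrong_readings(true_angle):
    """Return the wrong answers described in the comments for one angle.

    For an acute or obtuse angle the "reverse" reading is 180 minus the angle.
    For a reflex angle it is 360 minus the angle, and a second error is to
    subtract that reading from 180 instead of 360.
    """
    if not 0 < true_angle < 360 or true_angle == 180:
        raise ValueError("expected an angle between 0 and 360 degrees, other than 180")
    if true_angle < 180:
        return [180 - true_angle]
    reverse = 360 - true_angle
    return [reverse, 180 - reverse]


# The three angles printed in the student's booklet.
for label, angle in (("A", 120), ("B", 58), ("C", 280)):
    print(label, angle, likely_wrong_readings(angle))
```

For the three task angles this gives 60° for Angle A, 122° for Angle B, and 80° and 100° for Angle C, matching the answers reported above.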






Percentage of Correct Responses (with Standard Errors)

                          Angle A      Angle B      Angle C

Alberta                   64 (3.6)     61 (3.6)     22 (4.6)
British Columbia          65 (2.9)     72 (2.1)     33 (1.7)
Nova Scotia               63 (4.7)     47 (3.5)     23 (4.9)
Ontario-English           59 (2.6)     61 (2.7)     28 (2.5)
Ontario-French            63 (3.0)     72 (2.4)     22 (3.4)
Saskatchewan-English      59 (4.3)     67 (2.2)     26 (4.3)
Saskatchewan-French       72 (0.0)     69 (0.0)     33 (0.0)
England                   63 (4.6)     68 (4.4)     40 (8.3)
Scotland                  65 (2.0)     65 (3.6)     37 (3.4)
Soviet Union              63 (3.2)     58 (1.5)     33 (2.9)
Taiwan*                   68 (2.2)     n/a          n/a

* No information for Angle B and Angle C
LEAVES

Task Descriptor

To find the area of two irregular shapes using a grid square.

Equipment/Materials

Two drawings of leaves (with areas of 21 cm² and 48 cm² respectively), tracing paper and
transparent grid marked in square centimetres.

Student Instructions

Find the area of the leaves, using the grid. You may use the tracing paper on top of the grid
to mark off the boxes that cover the leaf totally or partially.

Scoring Scheme

Credit given for correct areas ± 4 cm².

Problems

When master copies of the leaf drawings were photocopied, there were some minor variations
in the sizes of the copies produced. Because of this, credit has been given for answers within a
wider range than originally intended. Due to a translation problem, the student instructions in
the Soviet Union omitted some relevant information. Although this was corrected during the
assessment, the results from the former Soviet Union are not comparable to those from other
countries and provinces.

Comments

• The students' scores tended to be higher for the smaller leaf.

• Performance differences among countries and provinces were small, in general.


Percentage of Correct Responses (with Standard Errors)

                          Leaf A       Leaf B

Alberta                   63 (4.1)     54 (3.4)
British Columbia          62 (2.3)     65 (2.5)
Nova Scotia               59 (3.5)     48 (3.6)
Ontario-English           50 (3.3)     48 (3.5)
Ontario-French            59 (2.8)     48 (3.2)
Saskatchewan-English      52 (2.9)     46 (2.0)
Saskatchewan-French       62 (0.0)     51 (0.0)
England                   67 (9.2)     63 (4.6)
Scotland                  65 (3.3)     59 (2.4)
Soviet Union              41 (3.5)     35 (4.5)
Taiwan                    70 (3.0)     63 (3.0)
PINBOARD
Task Descriptors 

To construct on a nine-point pinboard (or geoboard): (1) a triangle with the largest possible 
area, (2) a five-sided shape, and (3) three squares of different sizes. 

Equipment/Material 

A nine-point pinboard (or geoboard), rubber bands, and three nine-point grids printed in the
student's book. (Lines have been added to show examples of correct answers.) 




Five-sided shape

Large square    Small square    Diamond square

Student Instructions 

Use the rubber bands to make the required shapes on the pinboard, then draw the shapes on the 
three grids in your booklet. 

Scoring Scheme 

Credit given for: (1) drawing either version of the largest triangle, (2) drawing any five-sided 
shape, including side-by-side triangles, and (3) drawing a large square, any small square, and a 
diamond square. 

Comments 

• Scores were very high for the largest triangle, the large square, and the small square and 
only slightly lower for the five-sided shape. 

• In the identification of three different squares, there was a markedly poorer performance on 
the diamond square, presumably because of its atypical orientation. 
Percentage of Correct Responses (with Standard Errors)

[Charts not reproduced: percentage of correct responses, by country and province, on a scale from 0 to 100, for the large square, small square, and diamond square, and for the largest triangle and five-sided shape.]
TRIANGLES
Task Descriptors

To combine four triangular shapes to form: (1) a square, (2) a large triangle, and (3) a six-sided 
shape. 

Equipment/Material 

Four equilateral triangles. (Dotted lines have been added to show the position of the four 
triangles.) 



Square 




Large Triangle 




Six-sided Shape



Student Instructions

Use the triangles to make the required shapes without leaving any spaces between the triangles. 
Trace around the outside of the shapes in your booklet. 

Scoring Scheme 

Credit given for: (1) a square, (2) a large triangle, and (3) any six-sided shape.

Comments

• Scores were generally high for the square and triangle but lower for the six-sided shape, 
particularly in countries other than Canada. 

• The omission rate on all of these tasks was relatively high, 8 percent across all 
participating countries and provinces. 
Percentage of Correct Responses (with Standard Errors)

[Chart not reproduced: percentage of correct responses for the square, large triangle, and six-sided shape, by country and province, on a scale from 0 to 100.]
SCISSORS
Task Descriptors

To create three defined shapes from sheets of paper in one cut.

Equipment/Material

Drawings of three shapes (see below), a pair of scissors, and sheets of paper. (Dotted lines 
have been added to show the required folds.) 

[Drawings of shapes 1, 2, and 3]



Student Instructions 

Fold a sheet of paper and make one straight cut to produce each of the required shapes.

Scoring Scheme

Credit given for the correct cut out(s) and evidence of the correct folding (and no evidence of 
more than one cut). 

Comments 

• Scores were generally quite high on these tasks, ranging from 60 to 84 percent correct, and 
differences among countries and provinces were relatively small. 

• The consistency of the results on the three tasks could indicate that students who grasped 
the correct strategy in the first task tended to complete the remaining tasks successfully. 

• Fifteen percent of students across all participating countries and provinces made more than 
one attempt on these tasks, most of them two attempts. Omission rates were about 6 
percent across all students. 







Percentage of Correct Responses (with Standard Errors)

[Chart not reproduced: percentage of correct responses for the side notches, center diamond, and two diamonds shapes, by country and province, on a scale from 0 to 100.]
CLAY

Task Descriptor 

To make a 15g lump of modelling clay from a larger lump of clay, using a two-pan balance 
and two masses. 

Equipment/Material 

A large lump of modelling clay, a two-pan balance, and two masses, 20g and 50g, respectively. 




20g 



50g 



Student Instructions

Make a 15g lump of clay using the materials provided and explain how you did it. (Simple
instructions on how to use the balance were provided.)

Scoring Scheme 

Credit given for a clay lump of 15g ± 2g and for describing a feasible method of obtaining it.

Comments

• All but 5 percent of the students across participating countries and provinces produced a
lump of clay, but 11 percent of them did not provide a description of how they obtained it.

• Scores for obtaining clay lumps of 15g ± 2g were generally lower in the Canadian provinces
than elsewhere, the exception being Nova Scotia.

• There was a strong tendency towards the use of estimation rather than mathematically
precise methods in England and the Canadian provinces.
Percentage of Correct Responses (with Standard Errors)

[Chart not reproduced: percentage of students producing a lump of 15g ± 2g, by country and province, on a scale from 0 to 100.]

Percentage of Students Using Different Methods (with Standard Errors)

[Chart not reproduced: percentage of students using each type of method, including estimation methods, by country and province, on a scale from 0 to 100.]
WATER
Task Descriptor

To measure the capacity of two plastic containers using a measuring cup.

Equipment/Material

A large container filled with water, two smaller containers labelled A and B and a measuring 
cup (500ml graduated in 25ml units). Containers A and B were marked with a black line at 
375ml and 725ml, respectively. 



Student Instructions 

Fill containers A and B up to the black lines from the large container. 

Measure the amount of water in each container in millilitres, using the measuring cup. 

Scoring Scheme 

Credit given for correct volume ± 25ml.

Problems

Checking the calibration of the black lines on the containers and measuring cups after the
assessment revealed significant variations. For container A, these differences could be
accommodated by giving credit for answers within ± 25ml. Because the problem was
magnified in the measurement of container B, it proved impossible to make suitable
adjustments and these results were discarded.

Comments 

• Scores were generally high at about 90 percent, the exceptions being Taiwan and the 
Soviet Union, where scores were somewhat lower. 

• Non-response rates were very low (1 percent across all countries and provinces) but
spillage was high!
Percentage of Correct Responses (with Standard Errors)

[Chart not reproduced: percentage of students measuring container A as 375ml ± 25ml, by country and province, on a scale from 0 to 100.]
Science Tasks
LIGHT-UP



Task Descriptor 



To categorize objects according to their electrical conductivity by completing an electrical 
circuit; to explain why some objects enable a bulb to light; to predict whether an object in a 
sealed container would enable the bulb to light and to explain why (or why not). 

Equipment/Material

An electrical circuit with a bulb and a gap with two contacts which could be bridged. Five
objects as follows: wood strip, plastic strip, nail, foil strip, and cardboard strip. Also an
object (piece of copper wire) in a sealed, clear plastic box.

Student Instructions

Complete the circuit using the five objects in turn. List those objects that enable the bulb to
light and explain why. Say whether you think object X would enable the bulb to light and
explain why.

Scoring Scheme

Credit was given for identifying the nail and foil strip as conductors and for giving an
explanation mentioning one of the following or its equivalent: objects conduct electricity, allow
electricity/charge to pass, complete the circuit, are metal. Also credit was given for saying
object X would enable the bulb to light and for giving an explanation as above.

Problems

There was a problem in some Canadian provinces where the word "enable" in the instructions
was read as "unable." Students who listed the nonconductors and provided an appropriate
explanation were counted as giving the correct answers.

Comments

• Most students, 78 to 93 percent, categorized the objects correctly, but somewhat fewer
were able to give a valid explanation for what they had done.

• In four of the countries and provinces, more students recognized the conductivity of object
X than had categorized the original objects correctly and in two of these countries and
provinces (and three in total), more students gave a valid explanation for their decision.
Percentage of Correct Responses (with Standard Errors)

[Charts not reproduced: by country and province, on a scale from 0 to 100, the percentage of students who identified the nail and foil strip and provided a valid explanation, and the percentage who identified object X as a conductor and provided a valid explanation.]
CIRCUIT
Task Descriptor 

To construct an electrical circuit as represented in a drawing by selecting appropriate 
components and connecting them correctly. 

Equipment/Material

Drawing of the circuit and a set of components as listed below. (The number of each component
required to construct the circuit is shown in parentheses.)



3 batteries (2) 

2 battery holders (2) 

3 bulbs (2) 

2 bulb holders (2) 

1 switch (1) 

6 wires with clips (5) 




Student Instructions 

Use the objects on the card to make up the circuit shown in the drawing. You may not have to 
use all of the equipment. When your circuit matches the diagram, close the switch and see 
what happens. Raise your hand and ask the administrator to check your work. 

Scoring Scheme 

Credit was given for the correct positioning of batteries and bulbs, and for using five wires to
form a closed loop, thus enabling the bulbs to light.

Problems 

A loose connection in a bulb holder in one of the kits used in Ontario prevented the two bulbs 
from lighting, but students were credited for constructing the circuit correctly. 

Comments 

• Almost all students across participating countries and provinces completed this task 
successfully. 



Percentage of Correct Responses (with Standard Errors)

[Chart not reproduced: percentage of students with batteries, bulbs, and wires in correct position, by country and province, on a scale from 0 to 100.]




FILTER 

Task Descriptor 

To set up apparatus for filtering, as shown in a drawing, and to filter some muddy water.

Equipment/Material

A ring stand, a funnel, a beaker, and a folded filter paper. Also, a bottle of muddy water. 




Student Instructions 

Set up the apparatus as shown in the drawing above, put the folded filter paper into the funnel,
and pour a small amount of muddy water into the funnel. Raise your hand when you have
gotten some clear water and ask the administrator to check your work.

Scoring Scheme 

Credit was given for the apparatus being assembled correctly, the filter paper being inserted 
correctly in the funnel, and for any clean water obtained. 

Problems 

In the pilot-testing, filter papers were supplied unfolded and this caused widespread problems, 
but in the final assessment they were pre-folded. 

Comments 

• There was a high success rate, 86 to 100 percent correct, in assembling the apparatus, but
more difficulty was experienced with correctly inserting the filter paper, where success
ranged from 65 to 89 percent correct.

• Despite problems with the filter paper, many students were still able to obtain some clean
water.
Percentage of Correct Responses (with Standard Errors)

[Chart not reproduced: percentage of correct responses on the filtering task, by country and province, on a scale from 0 to 100.]
MAGNET
Task Descriptor

To use a magnet to identify magnetic and non-magnetic items and then to explain the 
difference between them. 

Equipment/Material 

A magnet and the following seven objects: plastic button, iron or steel washer, steel paper clip,
iron nail, glass marble, plastic rod, and copper coin.




Student Instructions 

Test the objects with the magnet and divide them into two groups. List the objects in the two 
groups and explain what makes the objects in the two groups different. 

Scoring Scheme 

Credit was given for grouping the objects correctly. Four categories of explanations were 
recorded: namely, that one group was made of iron or steel, that one group was attracted by the 
magnet, that one group was made of iron and steel and was attracted by the magnet, and any 
other explanation. 

Comments 

• Generally students performed the categorization task well, scores ranging from 86 to 95 
percent correct; but 10 percent of the students across all countries and provinces gave 
irrelevant explanations. 

• Omission rates were generally low, but there was a 6 percent omission rate in England.

• The most frequent explanation for students' categorization was that one group of objects 
was attracted by the magnet; 79 percent of the students across participating countries and 
provinces gave this response. Fewer, between 4 and 30 percent, mentioned iron or steel, 
and this varied considerably among countries and provinces. 
Percentage of Correct Responses (with Standard Errors)

[Chart not reproduced: percentage of students grouping the objects correctly, by country and province, on a scale from 0 to 100.]

Percentage of Students Giving Particular Explanations (with Standard Errors)

[Chart not reproduced: percentage of students giving each category of explanation, by country and province, on a scale from 0 to 100.]
INDICATORS



Task Descriptor 



To determine whether three solutions contain glucose, starch, or glucose and starch using
indicators for glucose (test strip) and starch (iodine solution).

Equipment/Material 

Three dishes labelled A, B and C containing the standardized, unknown solutions. Glucose test 
strips and iodine solution in a dropper bottle. 



Student Instructions 

The glucose test strip will turn from yellow to green on contact with a solution containing 
glucose and the iodine solution will turn blue-black when starch is present. The dishes A, B 
and C contain three different solutions which you are to test for glucose and starch using the 
indicators. Take the dish filled with solution A and dip the glucose test strip into it. Let the 
test strip dry. Add a drop of iodine solution to dish A. Observe all the results, report what 
solution A contains and repeat for solutions B and C.

Scoring Scheme 

Credit was given for identifying glucose only in solution A, starch only in solution B, and 
glucose and starch in solution C. 
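
The decision logic implied by the student instructions and scoring scheme can be sketched as follows. This is only an illustration: the function name and the "neither" outcome are assumptions, since the report describes just the three solutions A, B, and C.

```python
def identify_solution(strip_turned_green, iodine_turned_blue_black):
    """Map the two indicator results to the contents of a solution.

    strip_turned_green: glucose test strip changed from yellow to green.
    iodine_turned_blue_black: iodine solution turned blue-black.
    """
    if strip_turned_green and iodine_turned_blue_black:
        return "glucose and starch"
    if strip_turned_green:
        return "glucose only"
    if iodine_turned_blue_black:
        return "starch only"
    # Not one of the three solutions in the task; included for completeness (assumption).
    return "neither"


# Indicator results expected for the three dishes under the scoring scheme.
EXPECTED_RESULTS = {
    "A": identify_solution(True, False),
    "B": identify_solution(False, True),
    "C": identify_solution(True, True),
}
```

Solution C requires both indicators to be read correctly, which is consistent with it being the hardest of the three to identify.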

Comments 

• The differences in performance among countries and provinces were substantial in all three
tasks. For each task, the difference in the scores of the highest and lowest performing
populations was at least 20 points.

• Success rates in identifying the solution containing only glucose were highest, averaging 68
percent correct across participating countries and provinces. Those for the starch-only
solution and the mixture of both averaged 53 and 47 percent correct, respectively.
Percentage of Correct Responses (with Standard Errors)

[Chart not reproduced: percentage of correct responses on the three identification tasks, by country and province, on a scale from 0 to 100.]
FLOAT

Task Descriptor

To select correct observations about flotation from two sets of objects.

Equipment/Material

Two small glass jars labelled X and Y containing clear liquids and identical plastic toys, one
floating (in jar X) and one submerged (in jar Y).




Student Instructions 

Look carefully at the two jars - you may pick them up. Five other students looked at these 
jars and made the following statements. Which statements are observations, that is, they 
describe what the student actually saw? 

A. I see a toy floating in jar X. 

B. I see a toy floating in jar Y. 

C. I see a toy in jar X that is made of a different plastic than the toy in jar Y. 

D. I see jars containing colourless liquids and coloured toys.

E. I see a toy in jar Y that is heavier than the toy in jar X.

Scoring Scheme 

Credit was given for circling correct statements A and D and not circling incorrect statements 
B, C and E. 
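
The all-or-nothing rule in the scoring scheme amounts to a set comparison, as in this minimal sketch. The function names are hypothetical, and the per-statement helper is an assumption about how the "recognizing incorrect statements" percentages could be tallied, not a description of the report's procedure.

```python
CORRECT_STATEMENTS = frozenset({"A", "D"})
INCORRECT_STATEMENTS = frozenset({"B", "C", "E"})


def score_float(circled):
    """Credit only when exactly the correct statements, A and D, are circled."""
    return set(circled) == CORRECT_STATEMENTS


def recognized_as_incorrect(circled):
    """Return the incorrect statements the student left uncircled, in alphabetical order."""
    return sorted(INCORRECT_STATEMENTS - set(circled))
```

A student who circles A, D, and E recognizes B and C as non-observations but still earns no credit under this rule.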

Comments 

• The percentages of students who circled both correct statements and none of the incorrect
ones were low, ranging from 10 to 34 percent.

• Most students recognized statements A and D as observations. Almost all students
recognized that statement B, the opposite of statement A, is not an observation. Most
students recognized that statement C, that the two toys were made of different plastic, is
an incorrect statement, probably because the toys looked so similar. However, statement E,
that the masses of the toys were different, proved attractive to many students and they circled
it, even though they had no way of knowing the mass of the two toys.
Percentage of Correct Responses (with Standard Errors)

[Chart not reproduced: percentage of students circling both correct statements and none of the incorrect ones, by country and province, on a scale from 0 to 100.]

Percentage of Students Recognizing Incorrect Statements (with Standard Errors)

[Chart not reproduced: percentage of students recognizing each incorrect statement, by country and province, on a scale from 0 to 100.]
TABLET IN WATER



Task Descriptor 



To observe and record all the changes which take place when a tablet dissolves in water.

Equipment/Material

Water supply, plastic cup, and fruit-flavoured, coloured fizzy tablets.



Student Instructions 

Observe what happens when the tablet is in the water. Write as many different things as you 
notice. 

Scoring Scheme 

Credit given for all appropriate visual, auditory, and olfactory changes recorded.

Comments

• The changes that were recorded by most students were in the size of the tablet, the colour
of the water, and the bubbling of gas. These are all visual changes and it may be that the
use of the word "observe" in the students' instructions biased their responses towards such
changes. However, there were substantial differences in the reporting of different visual
changes and among different countries and provinces.

• A notable feature was a wide range in the reporting of the fizzing sound as the tablet
dissolved, from 3 percent in Taiwan to 50 percent in Nova Scotia.

• At least one-half of the students in participating countries and provinces mentioned four or
more observations, except in the Soviet Union and Taiwan, where the percentages were 45
percent and 34 percent, respectively.





Percentage of Students Mentioning Correct Observations (with Standard Errors)

[Chart not reproduced: percentage of students mentioning correct observations, by country and province, on a scale from 0 to 100.]

Percentage of Students Mentioning Specific Observations (with Standard Errors)

[Chart not reproduced: percentage of students mentioning specific observations, by country and province, on a scale from 0 to 100.]
SEEDS
Task Descriptor

To categorize two different types of seeds according to their size, shape and colour.

Equipment/Material

[illegible in source]

Student Instructions

[illegible in source]

Scoring Scheme

[illegible in source]

Problems

[Largely illegible in source; the surviving fragments refer to the first part of the task,
categorizing seed X.]

Comments

In general, high proportions of the students were able to assign seeds Y to the [remainder
illegible in source]
Percentage of Correct Responses for Container Y (with Standard Errors)

[Chart not reproduced: percentage of correct responses for container Y, by country and province, on a scale from 0 to 100.]
Appendix:
Problem Solving 
In Mathematics 

In order to provide a more detailed picture of the rich information 
produced from the performance assessment, this appendix describes the 
outcomes produced by Scottish students on one of the problem solving tasks in 
the mathematics circuit. Results are based on unweighted analyses of student 
responses. 

The station in the mathematics circuit named Clay required students to 
produce a 15g lump of plasticine (modelling clay). To achieve this they were 
provided with a large lump of plasticine, a two-pan balance with a centering 
needle (but no weight scale), and two masses, 50g and 20g. Almost all 
Scottish students produced a lump of plasticine intended to be 15g and in most 
cases an explanation of how it was produced. Fifty percent of students 
produced a plasticine lump within ± 2g of the required mass, the agreed
tolerance for a correct response. Many of the explanations provided were 
interesting, but some could not have produced the 15g! 

What strategies were adopted by students? 

Two types of strategy were used to produce lumps of plasticine of mass
15g: estimation and precision approaches.
Estimation Approaches. Some 25 percent of the Scottish students used
an estimation approach involving only the 20g mass and the balance to produce
a plasticine lump less than 20g. Some of these students said they removed 5g
once the plasticine lump balanced the 20g mass, others said they removed
one-quarter of it or left three-quarters of it. Clearly, the adoption of an
estimation procedure in these cases was not due to any lack of ability to make
the necessary calculations for a precision approach.

"I made my 15g lump of plasticine by weighing out 20g
and taking 5g away."

However, one student did not get even this correct. 

"Found out 20g and then took a 3rd of the plasticine 
away." 

Others may not have been able to calculate correctly:

"I adjusted the plasticine until their was 15g."

"I kept taking bits from the poke and tryed each time of 
the scales." 

Other students gave various explanations which provided only spurious 
accuracy. For instance, they mentioned deviations in the position of the scale 
pans and in the centering needle. 

"I placed a 20g mass in one pan and measured the 
plasticine until it was 20g and then bits off until it was 
3/5 of the way to the centre line." 
Others gave more exotic but even less meaningful explanations:

"First I weighed a lump of plasticine against a 50g mass 
to find out if it was more or less than 5g. It was less, so 
I weighed it against a 20g mass and just kept pulling bits 
off and putting bits on the plasticine until the 20g was 
just a bit heavier than the lump of plasticine." 

"First I found a lump that weighed 20g, gradually taking 
small pieces of. When I thought I had 15g I put the 50g 
in one and 20g and plasticine into another, putting extra 
pieces into it soon reached 50g with another 20g added." 

One or two students seemed to carry estimation to extremes: 

"You get some plasticine and roll it in your hands and 
make it round." (This student produced a 20g lump of 
plasticine). 

One student started off promisingly in what seemed a precision strategy 
but in the end relied on estimation. 

"I made 50g of plasticine and then took 20g away from 
it. 15g is half of 30g so I found out what 20g and 10g 
looked like and estimated how much plasticine to add to 
my 10g ball." 

Precision Approaches. Use of the 20g mass led 22 percent of the 
students into precision approaches. Having obtained the 20g of plasticine, they 
then systematically reduced this to 15g by: 

halving the 20g lump, then halving one of the 10g lumps 
and adding a 5g lump to the first 10g lump, or 

halving the 20g lump, halving both the 10g lumps and 
then combining 3 of the 5g lumps. 

The first, more efficient, approach was used by 14 percent of students, 
the second by 8 percent. 
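
The arithmetic behind these two routes can be sketched as follows. The function names are hypothetical, halving is idealised as exact, and counting halvings as the measure of efficiency is an assumption made here for illustration; the report does not state how efficiency was judged.

```python
from fractions import Fraction


def halve(lump_g):
    """Split a lump into two equal parts (idealised, no weighing error)."""
    half = Fraction(lump_g) / 2
    return half, half


def route_one(start_g=20):
    """Halve 20g, halve one 10g lump, add a 5g lump to the other 10g lump."""
    ten_a, ten_b = halve(start_g)
    five_a, _five_b = halve(ten_b)
    return ten_a + five_a, 2  # (final mass in grams, halvings used)


def route_two(start_g=20):
    """Halve 20g, halve both 10g lumps, combine three of the 5g lumps."""
    ten_a, ten_b = halve(start_g)
    fives = [*halve(ten_a), *halve(ten_b)]
    return sum(fives[:3]), 3  # (final mass in grams, halvings used)


if __name__ == "__main__":
    print(route_one())
    print(route_two())
```

Both routes arrive at 15g; on this count the first needs one fewer halving than the second.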

The most popular precision method involved the use of the 50g and 20g 
masses in opposite pans on the scales. With this approach, used by 26 percent 
of students, it was possible to make a 30g lump of plasticine in the first 
weighing operation, and then halve it. Of the students using this approach, 
only 41 percent specifically mentioned using the scales for the halving process, 
although more may have done so in practice. 

Two other students seemed to start with this approach but failed to 
halve the 30g lump, and so failed the task. 

A variation on this approach, used by six students, was to use the 50g 
mass to produce a 50g lump of plasticine. The 20g mass was then weighed 
against plasticine removed from this lump, until 30g of the lump was left. This 
was then halved. 
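
Both the opposite-pans method and its variation rest on the same subtraction, 50g minus 20g, followed by a halving. A minimal sketch, with hypothetical function names and an idealised balance that introduces no error:

```python
def opposite_pans(heavy_g=50, light_g=20):
    """Plasticine added beside the light mass until it balances the heavy one."""
    return heavy_g - light_g


def remove_from_lump(heavy_g=50, light_g=20):
    """Variation: make a lump equal to the heavy mass, then take away
    plasticine that balances the light mass."""
    lump_g = heavy_g      # lump balanced against the 50g mass
    lump_g -= light_g     # plasticine weighed against the 20g mass is removed
    return lump_g


def halve(lump_g):
    """Idealised halving of a lump."""
    return lump_g / 2


if __name__ == "__main__":
    print(halve(opposite_pans()))
    print(halve(remove_from_lump()))
```

Either way a 30g lump is reached before the halving. The variation needs an extra weighing to get there, which is consistent with the opposite-pans method being the more economical of the two.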

Use of a precision approach was usually associated with a good written 
description. However, there were exceptions as one student's explanation 
shows: 

"First a put the 50g weight in one pan and the twenty in 
another then a I balanced it which maid thirty then I halft 
it." 

One enterprising student seemed to set out using this approach but with 
success within his/her grasp used it only as an entry into the simpler precision 
approach involving only the 20g mass: 

"50g in left balance, 20g in right balance, added into right 
until it was 50g also, took out 20g from right, 30g left, 
put 20g in left balance, took out from right until equal, 
stuff taken out was 10g, broke 20g one up till both were 
10g, divided the 10g into two and weighed the bits till 
they were equal, added 10g bit to 5g bit = 15g." 

Convoluted, but the student achieved success - and a very long sentence! 

Another 11 students used precision approaches of varying complexity. 
For instance, at one end of the scale: 

"I used the 20g first and made two 20g balls. I then 
made a 10g ball using the 20g weight. I then measured 
all the balls made with the 50g weight then took the 10g 
and made another 5g ball, and stuck the 10g and the 5g 
ball together." 

At the other end of the scale, one student used the 20g mass to produce 
two 20g plasticine lumps. He/she halved one of these, added it to the other 
and halved the resulting 30g lump. 

Two students used both the 50g and 20g masses to produce a 70g lump 
of plasticine, halved it to give two 35g lumps, then used the 20g mass to 
remove 20g from one of them, leaving the required 15g. 

What does this mean? 

Almost all of the students tackled this problem and produced a lump of 
plasticine, over three-quarters of the lumps being between 10g and 20g. The 
task instructions did not ask for the lump of plasticine to be precisely 15g, and 
it is interesting to speculate whether this would have changed the proportions 
of students using estimation and precision approaches. In some countries and 
provinces, one or the other approach was predominant; for instance, in Taiwan 
and Scotland about twice as many students used precision approaches as 
estimation ones. In contrast, between two and four times as many students in 
Alberta and Saskatchewan used estimation approaches as compared to precision 
ones. 

The explanation for these differences presumably lies, at least in part, in 
the curriculum and teaching methods in the countries and provinces involved. 
However, our overall background variables can do nothing to illuminate such 
relationships, as neither problem solving in small groups nor students saying 
that "knowing how to solve problems is as important as getting the correct 
answer" appears to be correlated with the approaches used in this task. 

In Scotland, the most popular approach, used by 26 percent of students, 
was also the most economical in terms of weighing operations and arguably the 
most complex conceptually. This involved halving a 30g lump obtained by 
using both the 50g and 20g masses on opposite pans of the scales. The simpler 
halving of a 20g lump obtained by straight weighing using the 20g mass and 
then halving again, before recombining, attracted 22 percent of Scottish 
students. Estimation using the 20g mass attracted 20 percent of the Scottish 
students. One can only speculate whether the increasing emphasis put on 
problem solving in the Scottish school curriculum in recent years influenced 
these proportions. A similar exercise at age 9 might have illuminated the 
influence of intellectual development on students' problem solving capabilities. 
Next time perhaps! 